24 research outputs found

    An improved genome of the model marine alga Ostreococcus tauri unfolds by assessing Illumina de novo assemblies

    Background: Cost effective next generation sequencing technologies now enable the production of genomic datasets for many novel planktonic eukaryotes, representing an understudied reservoir of genetic diversity. O. tauri is the smallest free-living photosynthetic eukaryote known to date, a coccoid green alga that was first isolated in 1995 in a lagoon by the Mediterranean sea. Its simple features, ease of culture and the sequencing of its 13 Mb haploid nuclear genome have promoted this microalga as a new model organism for cell biology. Here, we investigated the quality of genome assemblies of Illumina GAIIx 75 bp paired-end reads from Ostreococcus tauri, thereby also improving the existing assembly and showing the genome to be stably maintained in culture. Results: The 3 assemblers used, ABySS, CLCBio and Velvet, produced 95% complete genomes in 1402 to 2080 scaffolds with a very low rate of misassembly. Reciprocally, these assemblies improved the original genome assembly by filling in 930 gaps. Combined with additional analysis of raw reads and PCR sequencing effort, 1194 gaps have been solved in total adding up to 460 kb of sequence. Mapping of RNAseq Illumina data on this updated genome led to a twofold reduction in the proportion of multi-exon protein coding genes, representing 19% of the total 7699 protein coding genes. The comparison of the DNA extracted in 2001 and 2009 revealed the fixation of 8 single nucleotide substitutions and 2 deletions during the approximately 6000 generations in the lab. The deletions either knocked out or truncated two predicted transmembrane proteins, including a glutamate-receptor like gene. Conclusion: High coverage (>80 fold) paired-end Illumina sequencing enables a high quality 95% complete genome assembly of a compact ~13 Mb haploid eukaryote. This genome sequence has remained stable for 6000 generations of lab culture

    Life-Cycle and Genome of OtV5, a Large DNA Virus of the Pelagic Marine Unicellular Green Alga Ostreococcus tauri

    Large DNA viruses are ubiquitous, infecting diverse organisms ranging from algae to man, and have probably evolved from an ancient common ancestor. In aquatic environments, such algal viruses control blooms and shape the evolution of biodiversity in phytoplankton, but little is known about their biological functions. We show that Ostreococcus tauri, the smallest known marine photosynthetic eukaryote, whose genome is completely characterized, is a host for large DNA viruses, and present an analysis of the life-cycle and 186,234 bp long linear genome of OtV5. OtV5 is a lytic phycodnavirus which unexpectedly does not degrade its host chromosomes before the host cell bursts. Analysis of its complete genome sequence confirmed that it lacks expected site-specific endonucleases, and revealed the presence of 16 genes whose predicted functions are novel to this group of viruses. OtV5 carries at least one predicted gene whose protein closely resembles its host counterpart and several other host-like sequences, suggesting that horizontal gene transfers between host and viral genomes may occur frequently on an evolutionary scale. Fifty seven percent of the 268 predicted proteins present no similarities with any known protein in Genbank, underlining the wealth of undiscovered biological diversity present in oceanic viruses, which are estimated to harbour 200Mt of carbon

    Analysis and modelling of the evolution of silent DNA regions

    Eukaryotic genomes consist mainly of non-coding regions. Even within coding sequences, some sites (mostly the third codon position) are synonymous and therefore constitute, together with the non-coding regions, the silent regions of DNA. Several models have been proposed to describe the evolution of silent DNA regions, chiefly opposing neutral models (mutational bias, biased conversion) to selectionist models. Analysing these evolutionary models makes it possible to determine their predictions, which can be tested on molecular data such as substitution rates or polymorphism. My thesis work consisted of building a model of the evolution of silent regions that involves selection, (i) to compare different functions for accumulating the effect of each site across the genome and (ii) to quantify the effect of linkage between selected sites on the efficacy of selection. This modelling work was complemented by analyses of genomic data in order to test some predictions of the models of silent-region evolution. One difficulty of selectionist models is defining a function that accumulates the effect of each site on the probability that a genome is passed on to the next generation. I compared selection schemes under additive and multiplicative site effects. Taking linkage between selected sites into account requires extensive simulation of the evolutionary process and makes it possible to assess the role of genome-wide recombination in the efficacy of selection. According to the analysis of the usage of the nematode [...] to minimize interference between selected sites. The analysis of human polymorphism data makes it possible to reject the mutational bias model for the evolution of the guanine and cytosine base composition of the genome. LYON1-BU.Sciences (692662101) / Sudoc, France

    Population genomics of Ostreococcus tauri (Chlorophyta): diversity and evolution of the smallest photosynthetic eukaryote

    The smallest free-living photosynthetic eukaryote, Ostreococcus tauri, is identified by its 18S rRNA-encoding sequence. To estimate the genetic diversity associated with this molecular identification, the total DNA of 13 strains sampled in the north-western Mediterranean Sea was sequenced with Illumina. From the analysis of genome-wide polymorphism, by alignment to the reference genome sequence, we estimated the genetic diversity at 0.004, which is similar to the polymorphism level estimated for other unicellular eukaryotes such as yeast. From these data we built the population-scaled recombination rate map of the O. tauri genome. Analysis of the mutational pattern, inferred from the frequencies of polymorphic sites, allows us to test the predictions of different evolutionary models for the origin of genome base composition. The O. tauri data are compatible only with the GC-biased gene conversion model. Analysis of the genealogy of the cytoplasmic genomes of this population allows us to confirm uniparental transmission of the mitochondrion and to propose biparental transmission of the chloroplast at the base of the green lineage. Finally, we estimated the potential of Illumina for de novo assembly of the O. tauri genome, with a view to its use for other species of eukaryotic phytoplankton. Overall, this thesis work sheds light on the level of diversity, the mode of reproduction and the evolution of O. tauri, and improves our understanding of the evolutionary history of the green lineage and of the evolutionary mechanisms that shaped it. PARIS-BIUSJ-Biologie recherche (751052107) / Sudoc, France

    A broad survey of recombination in animal mitochondrial DNA

    Recombination in mitochondrial DNA (mtDNA) remains a controversial topic. Here we present a survey of 279 animal mtDNA data sets, of which 12 were from asexual species. Using four separate tests, we show that there is widespread evidence of recombination; for one test as many as 14.2% of the data sets reject a model of clonal inheritance and in several data sets, including primates, the recombinants can be identified visually. We show that none of the tests give significant results for obligate clonal species (apomictic parthenogens) and that the sexual species show significantly greater evidence of recombination than asexual species. For some data sets, such as Macaca nemestrina, additional data sets suggest that the recombinants are not artifacts. For others, it cannot be determined whether the recombinants are real or produced by laboratory error. Either way, the results have important implications for how mtDNA is sequenced and used.

    Clues about the genetic basis of adaptation emerge from comparing the proteomes of two Ostreococcus ecotypes (Chlorophyta, Prasinophyceae).

    We compared the proteomes of two picoplanktonic Ostreococcus unicellular green algal ecotypes to analyze the genetic basis of their adaptation to their ecological niches. We first investigated the function of the species-specific genes using Gene Ontology databases and similarity searches. Although most species-specific genes had no known function, we identified several species-specific functions involved in various cellular processes, which could be critical for environmental adaptations. Additionally, we investigated the rate of evolution of orthologous genes and its distribution across chromosomes. We show that faster evolving genes encode significantly more membrane or excreted proteins, consistent with the notion that cell-surface modifications are driven by selection for resistance to viruses and grazers, keystone actors of phytoplankton evolution. The relationship between GC content and chromosome length also suggests that both strains have experienced recombination since their divergence and that lack of recombination on the two outlier chromosomes could explain part of their peculiar genomic features, including higher rates of evolution.

    Vanishing GC-rich isochores in mammalian genomes.

    To understand the origin and evolution of isochores (the peculiar spatial distribution of GC content within mammalian genomes), we analyzed the synonymous substitution pattern in coding sequences from closely related species in different mammalian orders. In primates and cetartiodactyls, GC-rich genes are undergoing a large excess of GC→AT substitutions over AT→GC substitutions: GC-rich isochores are slowly disappearing from the genome of these two mammalian orders. In rodents, our analyses suggest both a decrease in GC content of GC-rich isochores and an increase in GC-poor isochores, but more data will be necessary to assess the significance of this pattern. These observations question the conclusions of previous works that assumed that base composition was at equilibrium. Analysis of allele frequency in human polymorphism data, however, confirmed that in the GC-rich parts of the genome, GC alleles have a higher probability of fixation than AT alleles. This fixation bias appears not strong enough to overcome the large excess of GC→AT mutations. Thus, whatever the evolutionary force (neutral or selective) at the origin of GC-rich isochores, this force is no longer effective in mammals. We propose a model based on the biased gene conversion hypothesis that accounts for the origin of GC-rich isochores in the ancestral amniote genome and for their decline in present-day mammals.

    Phylogenetic position of the SSD -like sequence as inferred from the 18S rRNA sequences in 30

    Outgroup sequence, ; OT95, (clade C); RCC356, RCC344 and MIC106, surface strains (clade A); RCC393 and RCC143, deep strains (clade B); RCC501, surface strain (clade D). Numbers on branches are support values (posterior probability). Copyright information: Taken from "Picoeukaryotic sequences in the Sargasso Sea metagenome", http://genomebiology.com/2008/9/1/R5. Genome Biology 2008;9(1):R5. Published online 7 Jan 2008. PMCID: PMC2395239.